The nucleotide sequence between the peplomer and matrix protein genes in the
genome of the Purdue strain of porcine transmissible gastroenteritis coronavirus
(TGEV) was determined by sequencing parts of six cDNA clones. Open reading
frames potentially encoding proteins of 7,711, 27,711, and 9,241 Da were identified
(Fig. 1). The sequence of this region of the genome for the same strain of virus was
published by Rasschaert et al. (3), but our sequence differs by two bases, one of
which results in a major change in the properties of the second open reading
frame (ORF). G in our sequence at position 433, rather than T, enlarges the second
ORF from 165 to 244 amino acids and establishes a sequence context more
favorable for initiation of translation (4). C in our sequence at position 606,
rather than T, changes Leu to Pro. Each ORF is preceded by a sequence that is
similar to the CTAAAC intergenic sequence thought to be required for
leader-primed transcription (1,3). The enlargement of the second ORF from 165
amino acids (18,833 Da) to 244 amino acids (27,711 Da) resolves two concerns
raised by Rasschaert et al. (3), namely: (a) the second ORF in their sequence
would require initiation of translation 570 bases downstream from the CTAAAC
intergenic sequence (or 249 bases downstream of the CTAAAT sequence that we
propose is used), which is an unusually long distance; and (b) the second ORF in
their sequence is not large enough to encode the 24-kDa polypeptide translated
in vitro from TGEV mRNA 3 by Jacobs et al. (5), an mRNA
1 TAAATTTAAAATGTTAATTCTATCATCTGCTATAATAGCAGT TGTTTCTGCTAGAGAAT TT TGT TAAGGATGATGAATAAAGT CTT TAAGAACTAAACTT 


101 TCATTACAGGTCCTGTATGGACAT TGTCAAATCCATTTACACATCCGTAGATGCTGTACTTGACGAACTTGATTGTGCATACT I TGCTGTAACTC 


201 TTAAAGTAGAATTTAAGACTGGTAAAT TACT TGTGTGTATAGGTT I TGGTGACACACT TCT TGCTGCTAAGGATAAAGCATATGCTAAGCT TGGTCTCTC 


301 CATTAT TGAAGAAGT CAATAGTCATATAGITGITTAATATCAT TAAACACACAAAACCCAAAGCATTAAGTGTTACAAAACAATTAAAGAGAGATTATAG 


401 AAAAACTGTCATTCTAAATTCCATGCGAAAAT GAT TGGTGGACT TTT TCTTAGTACTCTGAGTITTGTAAT TGTTAGTAACCATTCTATTGTTAATAACA 


50 


TGPACKCTITGUATGTTITCTAGETT IGTACCATAGTAGAAACT I TASGAEGTATGYCGGCAT CTI AATGTITAAGATTTTATEAATGACACTTTTAGGA 
CETATGCTTATAGEATATGGT TACTACATTGATGGCAT TG] TAGAAGAAGTGUCTIATETTTAAGATITGICTACTIAGEATACTETTEGTATGLTARTA 


TGTGCATCATATACAACAAGAACGTGT TATAGT ACAACAGCATCAGGT GT TAGTGCTAGAACACAAAAC TAT TACCCAGAGT TCAGCATCGC 
GACRARATEYGCATCATATACAACAAGRACRTGUTATASY YT GPTAGTGRTARAREACAAARCTOTTACCEAGAGT TCARCAT CGS 


60 


70 


80 


QTAGGTETGRATITATT TTATACAATAGAAEGAGACTCAT GT i TGTACAT GGCAGAGETEEACEGT PIATGAgAAGT T g CACAGCT! grat T Ter GTCAG 
901 TGGTGGCATAAATTATATGTTTGTGAATGACCTCACGT IGCATTTTGTAt TATGCT TGTAAGCATAGCAATACGTGGCTTAGCTC: T 
1001 GATCTAAGTGTACTTAGAGEAGT TGAACT TCTCAATGGTGATTITATTTATGUAT IT TEACAGGAGCECGIAGTCGRTGTTTACAATGEAGECTITTETE 
1101 AGGEGGTTCTAAACGAAAT TGACT TAARAGAAGAAGAAGAAGACCATACCTATGACGTTTECTAGGGCATTGACTGTCATAGATGACAATGGAATGGTCA 
Q VT WE I DL K EE E E DH Y DV 
MT FPRALCTVYIbDDN G&G MY 
1201 TTAACA YGGTICCTGTTGATAATTATATIGATATTACTT TCAATAGCAT TGCTAAATATAAT TAAGCTATGCATGGTGTGT TGCAATTTAGG 
1301 AAGGA' TATTGTTCCAGCGCAACATGCT TACGATGCCTATAAGAAT TT TATGCGAATTAAAGCATACAACCCCGATGGAGCACTCCTTGCTIGA 


1401 ACTAAACAAAATG 


Fig. 1. Nucleotide sequence between the peplomer and matrix protein genes of TGEV and the deduced 
amino acid sequences for the three large open reading frames. The nucleotide sequence begins with the 
TAA stop codon of the peplomer gene (1, and unpublished data) and ends with the ATG start codon of 
the matrix protein (2). The consensus intergenic sequences are underlined. 


that maps in this region of the genome. The base differences we report were
obtained from two separate cDNA clones.
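
The effect of a single base difference on ORF length can be illustrated with a minimal sketch. The toy sequence and the helper names (find_orfs, substitute) are hypothetical; they are not the TGEV sequence and not the procedure used in this work. The sketch assumes, as the numbering in Fig. 1 suggests but the text does not state, that G at position 433 completes an in-frame ATG upstream of the initiation codon available in the sequence of Rasschaert et al. (3). That reading is consistent with the sizes given above: 244 - 165 = 79 additional codons, or 237 bases.

```python
# Toy illustration only: the sequence below is invented, not taken from TGEV.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Scan the three forward reading frames for ATG-initiated ORFs.

    Returns a list of (start, end, n_codons) tuples, where start is the
    0-based index of the A of the first ATG after the previous stop, end is
    the index just past the stop codon, and n_codons excludes the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None  # look for the next ORF in this frame
    return orfs


def substitute(seq, position, base):
    """Return seq with the base at a 1-based position replaced."""
    return seq[:position - 1] + base + seq[position:]


# Hypothetical 25-base sequence: an in-frame ATT lies upstream of the only ATG.
toy = "CCATTGGTGGAATGGCTAAATAACC"

# With T at position 5, translation can only start at the downstream ATG.
print(find_orfs(toy))

# With G at position 5, ATT becomes ATG and the ORF is extended upstream.
print(find_orfs(substitute(toy, 5, "G")))
```

In this toy case the substitution doubles the ORF from 3 to 6 codons without moving the stop codon, which is the same kind of N-terminal extension proposed above for the second ORF.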